Identification of Genes for Complex Diseases Using Integrated Analysis of Multiple Types of Genomic Data

Authors

  • Hongbao Cao
  • Shufeng Lei
  • Hong-Wen Deng
  • Yu-Ping Wang
Abstract

Various types of genomic data (e.g., SNPs and mRNA transcripts) have been employed to identify risk genes for complex diseases. However, these data types have largely been analyzed in isolation. Integrative analysis that combines them can exploit complementary information and thus gain power to identify genes (and/or their functions) that would be missed when each data type is analyzed alone. Because diverse sets of genomic data differ in nature, structure, and format, integrating them is challenging. Here we address the problem by developing a sparse representation-based clustering (SRC) method for integrative data analysis. As an example, we applied the SRC method to the integrative analysis of 376821 SNPs in 200 subjects (100 cases and 100 controls) and expression data for 22283 genes in 80 subjects (40 cases and 40 controls) to identify significant genes for osteoporosis (OP). Comparing our results with previous studies, we identified genes known to be related to OP risk (e.g., 'THSD4', 'CRHR1', 'HSD11B1', 'THSD7A', 'BMPR1B', 'ADCY10', 'PRL', 'CA8', 'ESRRA', 'CALM1', 'SPARC', and 'LRP1'). Moreover, we uncovered novel osteoporosis susceptibility genes ('DICER1', 'PTMA', etc.) that had not been reported previously but that, according to existing studies, play functionally important roles in osteoporosis etiology. In addition, the genes identified by the SRC method lead to higher accuracy in the diagnosis/classification of osteoporosis subjects than genes selected with the traditional t-test and Fisher exact test, which further validates the proposed SRC approach for integrative analysis.
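
The abstract names the t-test and the Fisher exact test as the traditional baselines against which SRC is compared. The sketch below illustrates only those two baseline statistics, not the SRC method, whose details the abstract does not give. It rests on assumptions: that the t-test is applied per gene to expression values and the Fisher exact test to a 2x2 table of SNP allele counts, and that Welch's unequal-variance form is acceptable (the abstract says only "T-test"). The function names `welch_t` and `fisher_exact_two_sided` are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, variance


def welch_t(cases, controls):
    """Welch's t statistic for one gene's expression, cases vs. controls.

    Assumption: Welch's unequal-variance form; the abstract does not say
    which variant of the t-test was used.
    """
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are cases/controls, columns are the counts of the
    two alleles of one SNP.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at most as likely as the observed one;
    # the small relative tolerance guards against float ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))
```

In a baseline of this kind, each gene or SNP would be scored independently and the top-ranked features passed to a classifier. That per-feature, per-data-type treatment is the "analysis in isolation" the abstract contrasts with integrative analysis.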

Similar resources

Identification of Alzheimer disease-relevant genes using a novel hybrid method

Identifying genes underlying complex diseases/traits that generally involve multiple etiological mechanisms and contributing genes is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...

Identification of Prognostic Genes in HER2-enriched Breast Cancer by Gene Co-Expression Network Analysis

Introduction: HER2-enriched subtype of breast cancer has a worse prognosis than luminal subtypes. Recently, the discovery of targeted therapies in other groups of breast cancer has increased patient survival. The aim of this study was to identify genes that affect the overall survival of this group of patients based on a systems biology approach. Methods: Gene expression data and clinical infor...

IDENTIFICATION, ISOLATION, CLONING AND SEQUENCING A PARTIAL ANNEXIN GENE FROM AUREOBASIDIUM PULLULANS

Background and Objectives: Annexin is the common name for genes and proteins that were identified as calcium-dependent phospholipid-binding proteins. Recently, a more complex set of functions has been recognized for this superfamily of proteins, including roles in vesicle trafficking, cell division, apoptosis, calcium signalling, mineralization, crystal nucleation inside the extracellular organelle...

Identification of genomic loci controlling phenologic and morphologic traits in barley (Hordeum vulgare L.) genotypes using association analysis

Association mapping is a high-resolution technique for QTL mapping based on linkage disequilibrium and has shown more promise for describing genetically complex traits. In addition, it is a powerful tool for describing complex agronomic traits and identifying alleles that can contribute to enhancing the desired traits. In this study, whole genome association mapping was used in a set of 14...

Dev105320 1..8

Genomic imprinting is a complex genetic and epigenetic phenomenon that plays important roles in mammalian development and diseases. Mammalian imprinted genes have been identified widely by experimental strategies or predicted using computational methods. Systematic information for these genes would be necessary for the identification of novel imprinted genes and the analysis of their regulatory...

Journal title:

Volume 7  Issue

Pages  -

Publication date 2012